1 research outputs found
Protecting the Future of Information: LOCO Coding With Error Detection for DNA Data Storage
DNA strands serve as a storage medium for -ary data over the alphabet
. DNA data storage promises formidable information density,
long-term durability, and ease of replicability. However, information in this
intriguing storage technology might be corrupted. Experiments have revealed
that DNA sequences with long homopolymers and/or with low -content are
notably more subject to errors upon storage.
This paper investigates the utilization of the recently-introduced method for
designing lexicographically-ordered constrained (LOCO) codes in DNA data
storage. This paper introduces DNA LOCO (D-LOCO) codes, over the alphabet
with limited runs of identical symbols. These codes come with an
encoding-decoding rule we derive, which provides affordable encoding-decoding
algorithms. In terms of storage overhead, the proposed encoding-decoding
algorithms outperform those in the existing literature. Our algorithms are
readily reconfigurable. D-LOCO codes are intrinsically balanced, which allows
us to achieve balancing over the entire DNA strand with minimal rate penalty.
Moreover, we propose four schemes to bridge consecutive codewords, three of
which guarantee single substitution error detection per codeword. We examine
the probability of undetecting errors. We also show that D-LOCO codes are
capacity-achieving and that they offer remarkably high rates at moderate
lengths.Comment: 14 pages (double column), 3 figures, submitted to the IEEE
Transactions on Molecular, Biological and Multi-scale Communications (TMBMC